Data minimization for GDPR compliance in machine learning models
Authors
Abstract
The EU General Data Protection Regulation (GDPR) and the California Privacy Rights Act (CPRA) mandate the principle of data minimization, which requires that only the data necessary to fulfill a certain purpose be collected. However, it can often be difficult to determine the minimal amount of data required, especially in complex machine learning models such as deep neural networks. We present a first-of-a-kind method to reduce the amount of personal data needed to perform predictions with a machine learning model, by removing or generalizing some of the input features of the runtime data. Our method makes use of the knowledge encoded within the model to produce a generalization that has little to no impact on its accuracy, based on distillation approaches. We show that, in some cases, less data may be collected while preserving the exact same level of model accuracy as before, and that if a small deviation in accuracy is allowed, even more generalizations may be performed. We also demonstrate that when collecting the features dynamically, the generalization may be further improved. This method enables organizations to truly minimize the amount of data collected, thus fulfilling the data minimization requirement set out in the regulations.
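The idea of generalizing input features without changing the model's predictions can be illustrated with a small sketch. This is not the authors' method: it swaps the distillation-based approach mentioned in the abstract for a plain greedy search over bucket widths, and `toy_model`, the feature names, and the candidate widths are all hypothetical.

```python
import random

def toy_model(row):
    """Hypothetical stand-in for a trained classifier: row = (age, income)."""
    age, income = row
    return 1 if income >= 50_000 else 0  # this toy model ignores age entirely

def generalize(value, width):
    """Replace a value with the midpoint of its bucket of the given width."""
    if width is None:  # None means "keep the exact value"
        return value
    return (value // width) * width + width / 2

def agreement(model, rows, widths):
    """Fraction of rows whose prediction is unchanged after generalization."""
    same = 0
    for row in rows:
        coarse = tuple(generalize(v, w) for v, w in zip(row, widths))
        same += model(coarse) == model(row)
    return same / len(rows)

def minimize(model, rows, candidates, min_agreement=1.0):
    """Greedily pick, per feature, the widest bucket that keeps agreement."""
    widths = [None] * len(candidates)
    for i, options in enumerate(candidates):
        for width in sorted(options, reverse=True):  # try the coarsest first
            trial = widths[:i] + [width] + widths[i + 1:]
            if agreement(model, rows, trial) >= min_agreement:
                widths = trial
                break
    return widths

# Synthetic data, for illustration only.
random.seed(0)
rows = [(random.randint(18, 90), random.randint(10_000, 120_000)) for _ in range(500)]
widths = minimize(toy_model, rows, candidates=[[5, 20, 100], [1_000, 10_000, 50_000]])
```

Lowering `min_agreement` below 1.0 corresponds to allowing a small deviation in accuracy in exchange for coarser, less personal inputs.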
Similar resources
Machine Learning Models for Housing Prices Forecasting using Registration Data
This article has been compiled to identify the best model of housing price forecasting using machine learning methods with maximum accuracy and minimum error. Five important machine learning algorithms are used to predict housing prices, including Nearest Neighbor Regression Algorithm (KNNR), Support Vector Regression Algorithm (SVR), Random Forest Regression Algorithm (RFR), Extreme Gradient B...
Static Analysis for GDPR Compliance
Information systems might access, manage and record sensitive data about citizens. In addition, the pervasiveness of these systems is dramatically increasing thanks to the mobile and IoT revolutions. However, several unintended data breaches are reported every week, and this might compromise the privacy, safety, and security of citizens. For all these reasons, the European Pa...
Modelling Provenance for GDPR Compliance using Linked Open Data Vocabularies
The upcoming General Data Protection Regulation (GDPR) requires justification of data activities to acquire, use, share, and store data using consent obtained from the user. Failure to comply may result in heavy fines, which incentivises the creation and maintenance of records for all activities involving consent and data. Compliance documentation therefore requires provenance informatio...
Some HCI Priorities for GDPR-Compliant Machine Learning
The General Data Protection Regulation: An Opportunity for the CHI Community? (CHI-GDPR 2018), Workshop at ACM CHI’18, 22 April 2018, Montréal, Canada. Abstract: In this short paper, we consider the roles of HCI in enabling the better governance of consequential machine learning systems using the rights and obligations laid out in the recent 2016 EU General Data Protection Regulation (GDPR), a law...
A New Approach to Credibility Premium for Zero-Inflated Poisson Models for Panel Data
The main aim of this research is to derive and compare credibility premiums in count models with unreported claims for longitudinal data. Predictive premiums are computed under squared-error and exponential loss functions and compared with each other. The desire to earn rewards and bonuses is one of the main reasons accidents go unreported, and policyholders often refrain from reporting low-cost accidents in order to keep their discounts. In this research ...
Journal
Journal title: AI and Ethics
Year: 2021
ISSN: 2730-5953, 2730-5961
DOI: https://doi.org/10.1007/s43681-021-00095-8